Skip to content

feat: add dynamic model selection for LLM queries#35

Merged
ChingEnLin merged 2 commits into
devfrom
worktree-multiple_model
May 20, 2026
Merged

feat: add dynamic model selection for LLM queries#35
ChingEnLin merged 2 commits into
devfrom
worktree-multiple_model

Conversation

@ChingEnLin

Copy link
Copy Markdown
Owner

Summary

  • Adds GET /query/models endpoint that fetches available Gemini models from the Google GenAI API (filtered to generateContent-capable models)
  • Threads an optional model field (default gemini-2.5-flash) through all AI request schemas (QueryPrompt, DebugQueryRequest, AnalyzeRequest, SchemaRelationshipsRequest, EvaluateWriteRequest) and all service functions
  • Updates all 10 previously hardcoded "gemini-2.5-flash" strings to use the passed-in model
  • Adds two model selector <select> dropdowns to the frontend (full-page layout and embedded/compact layout), populated dynamically from the API on load
  • Fixes test_debug_query assertion to match new call signature with model kwarg

Test plan

  • All 54 backend unit tests pass (pytest)
  • Black formatting passes
  • Model selector renders in the query generator UI below the iterations slider
  • Selecting a different model and generating a query uses that model on the backend
  • GET /query/models returns a list of Gemini model IDs
  • Fallback to gemini-2.5-flash if the models endpoint fails

🤖 Generated with Claude Code

Adds a GET /query/models endpoint that fetches available Gemini models
from the Google GenAI API, and threads a model parameter through all
AI service functions and request schemas so the user can switch models
at runtime. Two model selector dropdowns are added to the frontend UI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ChingEnLin

Copy link
Copy Markdown
Owner Author

@claude please review the changes in this PR

@claude

claude Bot commented May 20, 2026

Copy link
Copy Markdown

Claude finished @ChingEnLin's task in 1m 40s —— View job


PR Review: feat: add dynamic model selection for LLM queries

Todo List:

  • Load required tools
  • Read CLAUDE.md for repo guidelines
  • Analyze git diff of changes
  • Read changed files for detailed review
  • Post comprehensive review feedback

Summary

The PR is well-structured overall — the core approach of threading model as an optional parameter with a sensible default is clean and backward-compatible. The changes are consistent across all request schemas, service functions, and the frontend. A few issues are worth addressing before merge:


Issues Found

1. Double API call in /models fallback (Bug)

backend/routes/query.py:269-275

The fallback branch re-calls client.models.list() instead of reusing the already-fetched models iterable:

if not model_ids:
    # Fallback: return all gemini models even if supported_actions is missing
    model_ids = [
        m.name.replace("models/", "")
        for m in client.models.list()   # ← unnecessary second call
        if m.name and "gemini" in m.name.lower()
    ]

The models variable from line 260 already holds the list — reuse it:

if not model_ids:
    model_ids = [
        m.name.replace("models/", "")
        for m in models
        if m.name and "gemini" in m.name.lower()
    ]

Fix this →


2. Audit service not wired up (Incomplete)

backend/services/audit_service.py:76,96

The model parameter was added to generate_audit_sql and summarize_audit_results in gemini_service.py, but audit_service.py never passes it through. The PR description says "Updates all 10 previously hardcoded gemini-2.5-flash strings", but audit callers are untouched — the hardcoded defaults in those functions still take effect for audit queries.

If audit model selection is intentionally out of scope, the model parameter added to those two functions in gemini_service.py is premature/unused and should either be removed or a TODO comment added.


3. Unauthenticated /models endpoint

backend/routes/query.py:251-278

All other endpoints in this router require an authorization: str = Header(...) header and perform OBO token exchange. The new GET /query/models endpoint has no authentication. This leaks whether the backend Google API key is valid and exposes available model names to any caller.

Consider adding the same auth header requirement, or at minimum document the intentional decision to leave it open (e.g., models are non-sensitive and loaded on page mount before login completes).


4. Missing test for GET /query/models

backend/tests/test_query_routes.py

No test covers the new endpoint. At minimum, a test mocking client.models.list() to verify the filtering logic and the deduplication/sorting would catch regressions. The fallback behavior (empty supported_actions) is also worth a test case.


5. test_analyze_query doesn't assert model kwarg

backend/tests/test_query_routes.py:194

test_debug_query was correctly updated to assert model="gemini-2.5-flash" is passed, but test_analyze_query only calls mock_analyze.assert_called_once() without verifying the model kwarg. This is an inconsistency — if the route stops passing model, this test won't catch it.

Fix this →


Minor Notes

  • getAvailableModels missing dev-mode mock (frontend/services/geminiService.ts:17-25): Every other function in the file has a USE_MSAL_AUTH guard that returns mock data in dev mode. getAvailableModels goes straight to the real endpoint — this means dev builds will make a live network call on page load, which may fail silently or produce confusing logs.

  • Selected model not persisted: The selectedModel state resets to gemini-2.5-flash on page reload. Consider localStorage for persistence, especially since fetching the model list is async and the default is shown until it resolves.

  • Dual <select> sync: Both model dropdowns in the full-page and compact layout share selectedModel state, which is the right approach. No issue here, just confirming the wiring looks correct.


Verdict

The implementation is solid and the approach is well-thought-out. The double API call (item 1) and the audit service gap (item 2) are the most important to address before merge. The auth concern (item 3) is worth an explicit decision either way.

- Fix double API call in /models fallback by materializing the list once
- Move genai import to module level for testability
- Wire model param through audit_service and audit route
- Add two new tests for /models endpoint (normal + fallback paths)
- Fix test_analyze_query to assert model kwarg is passed
- Add USE_MSAL_AUTH dev-mode guard to getAvailableModels
- Persist selectedModel to localStorage so it survives page reloads

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ChingEnLin ChingEnLin self-assigned this May 20, 2026
@ChingEnLin ChingEnLin merged commit 4758c73 into dev May 20, 2026
3 checks passed
@github-actions

Copy link
Copy Markdown

🎉 This PR is included in version 2.11.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

@ChingEnLin ChingEnLin deleted the worktree-multiple_model branch May 25, 2026 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant